Amid the implementation of new curriculum standards regarding statistics and new recommendations
for preservice secondary mathematics teachers [PSMTs] to teach statistics, there is a need to
examine the current state of PSMTs' common statistical knowledge. This study reports on the
statistical knowledge of 217 PSMTs from a purposeful sample of 18 universities across the United
States. The results show that PSMTs may not have the strong common statistical knowledge that is
needed to teach statistics to high school students. PSMTs' strengths include identifying appropriate
measures of center, while weaknesses involve issues with variability, sampling distributions,
p-values, and confidence intervals.

Many have argued the need to increase students’ understanding of statistics (Shaughnessy, 2007).
Accordingly, there has been recent increased emphasis on statistics content in secondary curricula 
standards in the U.S., informed by recommendations from the National Council of Teachers of 
Mathematics (2000) and the Common Core Standards for Mathematics (National Governors 
Association Center for Best Practices & Council of Chief State School Officers, 2010). However, a
recent study of 1,249 high school students in the U.S. suggests that students are not developing a
conceptual understanding of statistics (Jacobbe, Foti, Case, & Whitaker, 2014). Since many teachers, 
including preservice secondary mathematics teachers (PSMTs), have likely had minimal experience 
with statistics in their own K-12 education, they also may not have had many opportunities to 
develop strong statistical understandings. 

The Conference Board of the Mathematical Sciences (2001, 2012) and the American
Statistical Association (ASA; Franklin et al., 2015) present recommendations for developing
the statistical knowledge and pedagogy needed by preservice mathematics teachers to teach statistics.
However, the 2011 International Congress of Mathematics Education Topical Study (Batanero,
Burrill, & Reading, 2011) highlighted the lack of research on the statistical knowledge of PSMTs
and called for more. The majority of research on preservice teachers’ statistical knowledge has
focused on elementary teachers (e.g., Browning, Gross, & Smith, 2014; Hu, 2015; Leavy &
O'Loughlin, 2006). The limited research conducted on PSMTs’ statistical knowledge has consisted of
small-scale studies at a small number of institutions, each focused on specific statistical content (e.g., Doerr &
Jacob, 2011; Lesser, Wagler, & Abormegah, 2014). While some smaller studies have suggested that
PSMTs may struggle with statistics (e.g., Casey & Wasserman, 2015), there are no large-scale studies
that describe the current state of new teachers’ statistical knowledge. This study examines the 
statistical knowledge of a large cross-institutional sample of PSMTs as they enter student teaching to 
answer the question: What are the strengths and weaknesses of PSMTs’ knowledge of the statistical 
content they will be expected to teach?


Framework 

Groth (2013) developed a hypothetical framework for Statistical Knowledge for Teaching 
consisting of two domains of knowledge teachers need to develop: subject matter knowledge and 
pedagogical content knowledge. Developing subject matter knowledge and key developmental 
understandings of statistics is foundational to be able to develop pedagogical statistical knowledge. 
Within subject matter knowledge there are three types: common content knowledge, specialized
content knowledge and horizon knowledge. Common content knowledge refers to knowledge gained 
through statistics taught in school and is considered common because it refers to knowledge for daily 
literacy or in any profession that uses statistics. This study examines the common statistical 
knowledge of PSMTs since they will soon be expected to teach these common statistical ideas as part 
of curricula for high school students. 

The Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report: A Pre-K- 
12 Curriculum Framework (GAISE, Franklin et al., 2007) describes statistical reasoning students 
should develop in K-12 and suggests this reasoning develops across three levels: A, B, and C.
Although there are not explicit definitions given for statistical reasoning in each level, the levels 
increase in statistical sophistication and become more abstract. The content in Level A represents 
topics for early or novice learners of statistics (elementary and middle school), Level B represents 
slightly more advanced statistical content (middle school or early high school), and Level C 
represents even more advanced content (high school or introductory college courses) (Franklin et al., 
2007). The GAISE report recommends that students learn statistical topics through engaging in a 
statistical investigative cycle consisting of: posing questions, collecting data, analyzing data, and 
interpreting results. Therefore, when examining PSMTs’ common statistical knowledge, it is useful 
to consider their understandings across these cycle phases and all three GAISE levels. 


Methodology 


Participating Institutions 

This study focuses on PSMTs prepared through university-based teacher preparation programs in 
the US. Since a random sample of all mathematics teacher preparation programs was unavailable, 
this study began with a purposeful narrowing on PSMTs who attend institutions in which some 
faculty have participated in the last 13 years in particular National Science Foundation (NSF)-funded 
or ASA-funded programs to increase the emphasis on statistics education at that institution. Faculty
from 57 institutions participated in the NSF-funded program, Preparing to Teach Mathematics with
Technology (PTMT, ptmt.fi.ncsu.edu), and/or the ASA-funded Math/Stat Teacher Education:
Assessment, Methods, and Strategies (TEAMS,
www.amstat.org/sections/educ/newsletter/v9n1/TEAMS.html) conference between 2002 and 2014.
These institutions were chosen since faculty members received professional development focusing on
explicit content and strategies for preparing PSMTs to teach statistics. Our assumption was that
PSMTs from these institutions may have had opportunities to engage in statistics content and
pedagogy activities in their coursework.

The sample was obtained by contacting all 57 institutions through their undergraduate program
coordinator for mathematics education to inquire whether the program was interested in participating.
Twenty-four programs expressed interest, and 18 participated. The coordinator identified the last
mathematics teaching methods course PSMTs take before student teaching, which would constitute
the data collection point in either fall 2014 or spring 2015. Of the 18 institutions, all but one were
public institutions. The majority of institutions (61.1%) had a Carnegie Classification™ (Carnegie
Foundation for the Advancement of Teaching, 2011) enrollment profile of high undergraduate.
Approximately 84% of participants attended institutions with a basic classification of Research
University/Very High, Research University/High, or Master’s college and university with a larger
program.


Participants 
Across the 18 institutions, 221 PSMTs were recruited by their mathematics teaching methods
instructor to take the assessment of their statistical understanding, described in the next section, as a
course assignment. Those who spent substantially less time (10 minutes) than
recommended by the authors of the assessment were excluded (Jacobbe, personal communication).
This resulted in a sample size of 217 PSMTs. The PSMTs were undergraduate juniors and seniors, or
graduate students earning initial licensure; all were enrolled in their last mathematics education
course prior to student teaching. The number of PSMTs participating from each institution ranged
from 2 to 31, with a mean of 12. Fourteen institutions had 100% participation of PSMTs who were
eligible to participate, with the remaining four institutions having between one and four students who
did not complete the assignment. The majority of PSMTs were female (71%), and 88% were
Caucasian. Almost all (93.4%) reported they had taken at least one statistics course at their institution
or had completed Advanced Placement Statistics in high school. 


Data Collection and Analysis 

To examine PSMTs’ common statistical knowledge, the Levels of Conceptual Understanding of 
Statistics (LOCUS) assessment (Jacobbe, Case, Whitaker, & Foti, 2014) was administered online 
(locus.statisticseducation.org). The LOCUS instrument assesses understanding of statistics across the
three GAISE levels of development and also assesses understanding within each phase of an
investigative cycle: formulating questions, collecting data, analyzing data, and interpreting results.
Participants took the 30-item multiple-choice Intermediate/Advanced Statistical Literacy version of the
assessment, which was designed for students in grades 10-12. The test consists of two Level A
questions, 11 Level B questions, and 17 Level C questions. This version has been validated and
found reliable with students in grades 6-12 for assessing statistical knowledge across Levels B and C and the
four phases of the investigative cycle (Jacobbe, personal communication). While this instrument is
not intended as a high-stakes assessment of knowledge, it does represent the statistics content PSMTs
are expected to teach their students in the near future. Thus, teachers are expected to score fairly high
on the assessment. While actual test items cannot be released due to test security, sample items for
the four categories at different levels are available on the LOCUS website
(locus.statisticseducation.org/professional-development). Each test-taker receives an overall score
(percent correct), as well as sub-scores for Level B, Level C, Formulating Questions, Collecting 
Data, Analyzing Data, and Interpreting Results. 
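
To make the scoring scheme concrete, the following is a minimal Python sketch of how an overall score and level subscores could be computed as percent correct. The mapping of item positions to GAISE levels is hypothetical (the actual LOCUS items are secure and their ordering is not given here); only the item counts per level (2, 11, and 17) come from the description above.

```python
# Hypothetical sketch: percent-correct scoring for a 30-item test.
# The assignment of item indices to levels is invented for illustration;
# only the counts per level (2 A, 11 B, 17 C) come from the text.

LEVELS = {
    "A": range(0, 2),
    "B": range(2, 13),
    "C": range(13, 30),
}


def percent_correct(responses, items):
    """Return the percent of the given item indices answered correctly."""
    items = list(items)
    return 100.0 * sum(responses[i] for i in items) / len(items)


def score(responses):
    """Score a list of 30 responses, where 1 = correct and 0 = incorrect."""
    if len(responses) != 30:
        raise ValueError("expected 30 item responses")
    result = {"overall": percent_correct(responses, range(30))}
    for level, items in LEVELS.items():
        result[f"level_{level}"] = percent_correct(responses, items)
    return result


# Example: a made-up test-taker who misses the last six Level C items.
example = [1] * 24 + [0] * 6
print(score(example))
```

Subscores for the four phases of the investigative cycle could be computed the same way with a second, equally hypothetical, mapping of items to phases.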

To examine the statistical knowledge demonstrated by PSMTs, descriptive statistics were 
computed for the overall score and each subscore. A paired samples t-test was used to test for a
significant difference in PSMTs’ statistical knowledge between GAISE Levels B and C, and a repeated
measures ANOVA was used to test for significant differences in PSMTs’ statistical knowledge among
the four phases of a statistical investigation. An item analysis was conducted to closely examine
PSMTs’ strengths and weaknesses.
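
As an illustration of the paired comparison described above, here is a minimal sketch of a paired samples t statistic using only the Python standard library. The scores are made up for five hypothetical test-takers; they are not data from this study, and the helper name `paired_t` is introduced here only for illustration.

```python
import math
import statistics


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for x vs. y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_d == 0:
        raise ValueError("differences have zero variance")
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1


# Made-up subscores (percent correct) for five hypothetical test-takers.
level_b = [72.7, 81.8, 63.6, 90.9, 54.5]
level_c = [64.7, 70.6, 58.8, 82.4, 58.8]
t, df = paired_t(level_b, level_c)
print(f"t = {t:.3f}, df = {df}")
```

Because each person contributes both a Level B and a Level C score, the test operates on the within-person differences rather than treating the two sets of scores as independent samples.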


Results 
Trends in scores on the LOCUS test can help in describing what PSMTs from these 18 
universities currently understand about the statistics content they will soon be responsible for 
teaching. The summary statistics for PSMTs’ scores are reported in Table 1. With a mean overall
score of 69% and a standard deviation of 14.06, PSMTs do not seem to demonstrate a conceptual
understanding of the statistical content they will teach high school students. PSMTs scored, on
average, significantly higher on Level B questions than on Level C questions (t=5.772, p<0.001),
demonstrating that their statistical knowledge is weaker as items increase in sophistication. The
distribution of PSMTs’ scores is shown in Figure 1. The boxplots show that for the overall scores and
subscores, there are at least some PSMTs who scored between 90% and 100% correct, indicating that they
likely have strong common statistical knowledge of the topics they will soon be responsible for teaching.
However, there is cause for concern, since only one-quarter of PSMTs scored above 77% overall and a
quarter scored below 57% overall. The variation in scores seems somewhat similar for Level C
scores. However, the higher standard deviation in Level B scores is likely due to the greater number
of low-scoring individuals, indicated as outliers in Figure 1.


Table 1: PSMTs’ Percent Correct on LOCUS Instrument

                                             Number of items   Mean    SD
Overall Score                                      30          68.61   14.06
GAISE Levels
  Level B Score                                    11          70.85   17.69
  Level C Score                                    17          64.87   14.16
Phases of Statistical Investigative Cycle
  Formulating Questions                             5          80.37   21.51
  Collect Data                                      7          70.40   19.70
  Analyze Data                                      7          63.34   22.22
  Interpret Results                                11          60.48   16.25


Figures 1 and 2. Distribution of PSMTs’ LOCUS scores 


Examining subscores by phases in the statistical investigative cycle, PSMTs scored higher on 
average on Formulating Questions and lower as the cycle progresses, scoring lowest on Interpreting 
Results items (Table 1). A repeated measures ANOVA determined that mean scores differed
significantly among the four phases [F(3,648)=64.73, p<0.001]. Post hoc tests using a
Bonferroni correction revealed that PSMTs scored significantly lower as the cycle progressed
(p<0.001). However, the difference between mean scores for Analyze Data and
Interpret Results was not statistically significant (p=0.32). The distribution of scores across the four phases is shown in Figure 2. The
boxplots show that for all four phases, there are again some PSMTs who scored between 90% and 100%,
indicating that those PSMTs likely have the common content knowledge that will be needed when
teaching that phase of the investigative cycle. On Formulating Questions items, at least half of
PSMTs scored 80% or higher, and a quarter of those scored 100%, indicating stronger understanding
for these PSMTs about Formulating Questions. However, half of PSMTs scored below 71% on
Collecting Data and Analyzing Data items, and half scored below 64% on Interpreting Results items.
Even interpreted conservatively, this result provides convincing evidence that the majority of these PSMTs do not have the
common statistical knowledge that can provide a foundation for teaching students key concepts 
related to Collecting Data, Analyzing Data, and Interpreting Results. 
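
The degrees of freedom reported for the repeated measures ANOVA can be checked against the design, and the Bonferroni adjustment can be illustrated with a short sketch. The family-wise level of 0.05 is an assumption made here for illustration; the text does not state the level used. The error degrees of freedom follow the usual one-way repeated measures formula, (k - 1)(n - 1), which assumes sphericity.

```python
from itertools import combinations

n_participants = 217  # PSMTs in the final sample
phases = [
    "Formulating Questions",
    "Collecting Data",
    "Analyzing Data",
    "Interpreting Results",
]
k = len(phases)

# One-way repeated measures ANOVA degrees of freedom.
df_effect = k - 1                           # 4 - 1 = 3
df_error = (k - 1) * (n_participants - 1)   # 3 * 216 = 648

# Bonferroni: divide the family-wise alpha by the number of pairwise tests.
pairs = list(combinations(phases, 2))       # 6 pairwise comparisons
family_alpha = 0.05                         # assumed, not stated in the text
per_test_alpha = family_alpha / len(pairs)

print(df_effect, df_error, len(pairs), round(per_test_alpha, 4))
```

The computed values of 3 and 648 match the reported F(3,648), which is consistent with all 217 PSMTs contributing a score for each of the four phases.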
Item Analysis

Upon further analysis of individual items classified by the statistical investigative cycle, themes 
emerged concerning PSMTs’ strengths and weaknesses. As previously mentioned, PSMTs scored the
highest on average for Formulating Questions items, with no common misunderstanding identified. 
As an example of their strength in understanding this phase, PSMTs were able to read a description 
of a study and measurements taken to identify an appropriate statistical question of interest. 

Collecting Data. On average, PSMTs scored the next highest on Collecting Data items. PSMTs 
were able to identify ways to improve a study design given a study and measurements, identify which
study design would be best based on a question of interest, and identify a data collection plan based
on a study description. Thus, these PSMTs seem to have strong common content knowledge related
to the design of a statistical study. 

Even though PSMTs were able to develop a data collection plan, they struggled more when asked 
to identify how to choose a sample to minimize bias. Only 64.5% were able to choose a correct 
sampling method; instead, 30% chose a convenience sample or a stratified sample that seemed
complicated but was not random. Thus, they do not seem to have a strong understanding of the role
of an appropriate sampling method within the design of a study. Another common misunderstanding
among PSMTs concerned the conclusions that can be drawn from a specific study design. Figure 3 shows an
item similar to the one PSMTs were asked on the assessment. Over 58% of PSMTs chose an answer similar
to answers (A) and (C) that allowed a researcher to generalize results to an entire population based on
a sample of volunteers. These findings highlight PSMTs’ need for a deeper understanding of the
ways in which study designs and data collection processes impact the conclusions that can be drawn.


Each member of a random sample of 1,000 adult males from the United States was asked a
number of questions, including questions about height and annual income. When the
responses were analyzed, it was determined that taller men had greater incomes than shorter
men, on average, and the difference was statistically significant. Which of the following
conclusions would be most appropriate based on these results?

(A) The study establishes that being tall causes men to have greater incomes, on average,
and this conclusion can be generalized to all men in the United States.

(B) The study establishes that being tall causes men to have greater incomes, on average, but
this result only applies to the men in the sample.

(C) The study establishes that being tall causes men to have greater incomes, on average,
than shorter men, and this conclusion can be generalized to all men in the United States.

(D) The study establishes that being tall causes men to have greater incomes, on average,
than shorter men, but this result only applies to the men in the sample.


Figure 3. Sample Collect Data item from locus.statisticseducation.org.


Analyzing Data. PSMTs’ average scores for Analyzing Data items were the second lowest
among the phases and had the highest variability. PSMTs demonstrated that they understand which
measure of center is appropriate for a given context, how measures of center and variation change
when data values are changed, and how to justify an association from a two-way table. However,
PSMTs demonstrated more difficulty with Analyze Data items that involved understanding of
variation in data. Only 43% of PSMTs could identify a histogram containing data that varied the least
from its mean. Instead, 30% of PSMTs chose a uniform distribution and about 20% thought
variability from the mean was the same for all three distributions. PSMTs demonstrated another
misunderstanding related to expected variation in sample means when repeatedly sampling from a
population. When given the distribution of a population and the population mean, 36% of PSMTs could
not identify the distribution of sample means. Instead, they chose distributions that resembled the
general shape of the population distribution. These results point to PSMTs’ need for more common
content knowledge regarding variation, sample distributions, and distributions of sample statistics.
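
The distinction between a population distribution and the distribution of sample means can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of the assessment item: the skewed population, the sample size of 30, and the number of repetitions are all made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# A right-skewed, made-up population (exponential with mean 10).
population = [random.expovariate(1 / 10) for _ in range(20_000)]


def sample_means(pop, n, reps):
    """Means of `reps` simple random samples of size n drawn from pop."""
    return [statistics.mean(random.sample(pop, n)) for _ in range(reps)]


means = sample_means(population, n=30, reps=1_000)

pop_sd = statistics.pstdev(population)
print("population mean:", round(statistics.mean(population), 2))
print("mean of sample means:", round(statistics.mean(means), 2))
print("population SD:", round(pop_sd, 2))
print("SD of sample means:", round(statistics.stdev(means), 2))
print("population SD / sqrt(30):", round(pop_sd / 30 ** 0.5, 2))
```

The sample means center on the population mean but vary much less than individual values do, so their distribution is narrower than the population distribution and need not share its shape.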

Interpret Results. PSMTs scored the lowest on average on Interpret Results items. However, on
five of the eleven Interpret Results items, 84% or more of PSMTs answered the items correctly. 
PSMTs were able to compare distributions in a context using the center and spread, demonstrate an 
understanding of the effect of sample size on a sample mean, and interpret survey results with a 
given margin of error. These are important concepts often taught in middle and high school curricula. 
On the other six Interpret Results items, the percentage of PSMTs responding correctly to these items 
ranged from 21% to 48%, and their misunderstandings were related to ideas of formal inference. 
PSMTs struggled most with statistical significance, identifying and interpreting a p-value, and 
explaining confidence intervals. About half (48%) of PSMTs were able to correctly interpret results 
given a large p-value and fail to reject the null hypothesis (Figure 4). However, 40% of PSMTs chose 
a conclusion that a large p-value meant they could reject the null hypothesis. 


[Name illegible] wondered whether students who graduate from a nearby college with an
engineering degree would have higher yearly salaries, on average, than students
graduating with a degree in mathematics. To investigate, he randomly selected ten
engineering graduates and eight mathematics graduates. He calculated the mean
yearly salary to be $58,421 for the ten engineering graduates and $55,402 for the
eight mathematics students. The difference in the sample means of $3,019 per year
resulted in a p-value of 0.478. Based on this information, which of the following
statements is correct?

(A) The difference in means is both statistically significant and practically
significant.

(B) The difference in means is neither statistically significant nor practically
significant.

(C) The difference in means is practically significant, but not statistically significant.

(D) The difference in means is statistically significant, but not practically significant.


Figure 4. Sample Interpret Results item from locus.statisticseducation.org. 


On another item regarding p-value, PSMTs were asked to reason if a p-value would be large or 
small for comparing means of two distributions given data on a dotplot. Only 35% of PSMTs were 
able to correctly identify that the p-value would be small due to the large gap between distributions. 
Almost 47% incorrectly answered that the p-value would be large due to a large gap between the 
distributions. These findings demonstrate that PSMTs, on average, do not understand
what it means for a result to be statistically significant or what a p-value represents, aspects of common content
knowledge that are expected in statistics and included in many high school curricula.
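
The relationship between the gap separating two distributions and the size of the p-value can be sketched with a permutation test. The data below are made up, and a permutation test is used here only because it needs no distributional assumptions; it is not necessarily the procedure behind the assessment item.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, reps=5_000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one adjustment so the estimate is never exactly 0.
    return (extreme + 1) / (reps + 1)


# Made-up data: similar spreads, different gaps between the groups.
wide_gap_a = [10, 11, 12, 12, 13, 14, 11, 13]
wide_gap_b = [20, 21, 22, 22, 23, 24, 21, 23]
overlap_a = [10, 11, 12, 12, 13, 14, 11, 13]
overlap_b = [11, 12, 13, 13, 14, 14, 10, 12]

print("wide gap p:", permutation_p_value(wide_gap_a, wide_gap_b))
print("overlap p:", permutation_p_value(overlap_a, overlap_b))
```

When the groups are widely separated, almost no random relabeling of the data produces a difference as large as the observed one, so the p-value is small; when the groups overlap, many relabelings do, so the p-value is large.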

The item PSMTs had the most difficulty with in Interpreting Results asked the test taker to 
explain the meaning of a 95% confidence interval for a mean. Approximately one-fifth chose the
correct response, that 95% of confidence intervals
constructed from random samples would capture the true mean. Almost half of PSMTs chose the
response that there was a 95% probability that the mean was in between the lower and upper limits of 
the confidence interval. These misunderstandings highlight the need for PSMTs to have more 
experiences with interpreting and understanding confidence intervals. 
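
The long-run interpretation of a confidence interval can be illustrated by simulation. The sketch below assumes a normal population with a known standard deviation and uses a z-interval for simplicity; the population values, sample size, and function name `coverage` are all hypothetical choices for illustration.

```python
import random
import statistics


def coverage(true_mean=50.0, sigma=10.0, n=25, reps=2_000, seed=1):
    """Fraction of 95% z-intervals (known sigma) that contain true_mean."""
    rng = random.Random(seed)
    z = 1.96  # approximate 97.5th percentile of the standard normal
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        center = statistics.fmean(sample)
        if center - half_width <= true_mean <= center + half_width:
            hits += 1
    return hits / reps


print("proportion of intervals capturing the true mean:", coverage())
```

The true mean is fixed throughout; what varies from sample to sample is the interval. The 95% refers to the proportion of such intervals that capture the mean over many repetitions, not to a probability attached to any single computed interval.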


Discussion and Conclusion 

Our study was situated within a purposeful sample of PSMTs enrolled in teacher education 
programs where a faculty member had participated in professional development projects that 
promoted increasing attention to statistics in secondary mathematics education courses. It is not 
known exactly how those teacher education programs currently include an emphasis on statistics, nor 
exactly what these PSMTs experienced at all 18 institutions. Nonetheless, there are several findings 
of this study that are significant to consider. Our results provide empirical evidence that PSMTs in
this study generally do not exhibit a strong common content knowledge of many aspects of statistics
needed for teaching high school students, and in particular they struggle more with the later phases of
a statistical investigation. Previous research has shown a similar trend with inservice teachers and
students measured by LOCUS (Jacobbe, 2015; Jacobbe, Foti, et al., 2014). Thus, PSMTs need more 
experiences in collecting data, analyzing data and interpreting results to develop a deeper 
understanding of all aspects of the statistical investigative cycle and to develop common statistical 
knowledge needed for teaching. 

PSMTs exhibit some similar strengths and weaknesses with concepts that high school and 
introductory college students develop. An important strength that PSMTs demonstrated is that they 
are proficient at identifying an appropriate measure of center for a given context. PSMTs’ strength in
understanding measures of center suggests they should be well equipped to help their future
students develop stronger conceptions. PSMTs’ weaknesses involve issues with variability, sampling
distributions, p-values, and confidence intervals. Many researchers have identified that these topics
are also often misunderstood by many students in undergraduate statistics courses (e.g., Aquilonius
& Brenner, 2015; Castro Sotos, Vanhoof, Van den Noortgate, & Onghena, 2007; delMas, Garfield,
Ooms, & Chance, 2007); thus, PSMTs’ common statistical knowledge may be no better than that of
other college students not preparing for teaching.

These findings, even though from a purposeful sample, suggest there is a critical need for 
mathematics teacher education programs to reevaluate the opportunities PSMTs have to increase
their common statistical knowledge. Our results specifically indicate that effort should focus on
developing PSMTs’ knowledge of variability, sampling distributions, and formal inference,
particularly as they are applied in the analyzing data and interpreting results phases of an
investigative cycle. While this study only reports on one aspect of PSMTs’ statistical knowledge for
teaching, the larger study (Lovett & Lee, 2017) provides more details about PSMTs’ confidence to
teach and the experiences they perceived had contributed to their confidence and understandings in
statistics. Additional large-scale studies are needed on all aspects of PSMTs’ statistical knowledge
for teaching and the impact that teacher education programs have on PSMTs’ preparedness to teach
statistics. 